NLP for the Greek Language: A Longer Survey
Papantoniou, Katerina, Tzitzikas, Yannis
There is a wide variety of methods, tools and resources for processing text in the English language. However, this is not the case for the Greek language, even though it has a long documented history spanning at least 3,400 years of written records (including texts in syllabic script) and 28 centuries (from the Archaic period to Modern Greek) of text written with an alphabet [1, 2]. The literary tradition of Greek, spanning over 2,500 years, is also notable. To aid those who are interested in using, developing or advancing techniques for Greek processing, in this paper we survey related works and resources, organized into categories. We hope this collection and categorization of works will be useful for students and researchers interested in NLP tasks, Information Retrieval and Knowledge Management for the Greek language.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- (63 more...)
- Research Report (1.00)
- Overview (1.00)
- Media > News (1.00)
- Information Technology (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- (4 more...)
- Information Technology > Communications > Web (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Speech (1.00)
- (10 more...)
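One Greek-specific preprocessing quirk that tools surveyed in work like the above must handle is diacritic stripping and the final-sigma alternation (ς vs. σ). A minimal stdlib-only sketch (the function name and normalization choices are illustrative assumptions, not taken from the survey):

```python
import unicodedata

def normalize_greek(text: str) -> str:
    # Illustrative normalization: lowercase, then strip diacritics
    # (tonos, dialytika) by NFD-decomposing and dropping combining marks.
    lowered = text.lower()
    decomposed = unicodedata.normalize("NFD", lowered)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold final sigma into ordinary sigma so word forms compare equal.
    return unicodedata.normalize("NFC", stripped).replace("ς", "σ")

print(normalize_greek("Γλώσσας"))  # -> γλωσσασ
```

Whether to fold diacritics at all depends on the task; for retrieval it usually helps recall, while for morphological analysis the marks carry information.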
GDA: Generative Data Augmentation Techniques for Relation Extraction Tasks
Hu, Xuming, Liu, Aiwei, Tan, Zeqi, Zhang, Xin, Zhang, Chenwei, King, Irwin, Yu, Philip S.
Relation extraction (RE) tasks show promising performance in extracting relations between two entities mentioned in sentences, given sufficient annotations available during training. Such annotations would be labor-intensive to obtain in practice. Existing work adopts data augmentation techniques to generate pseudo-annotated sentences beyond the limited annotations. These techniques neither preserve the semantic consistency of the original sentences when rule-based augmentations are adopted, nor preserve the syntactic structure of sentences when expressing relations using seq2seq models, resulting in less diverse augmentations. In this work, we propose a dedicated augmentation technique for relational texts, named GDA, which uses two complementary modules to preserve both semantic consistency and syntactic structure. We adopt a generative formulation and design a multi-tasking solution to achieve synergies. Furthermore, GDA adopts entity hints as prior knowledge for the generative model to augment diverse sentences. Experimental results on three datasets under a low-resource setting showed that GDA brings 2.0% F1 improvements compared with no augmentation technique. Source code and data are available.
- Asia > China > Hong Kong (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Greece > East Macedonia and Thrace > Komotini (0.04)
- (3 more...)
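The entity-hint idea in the abstract above can be illustrated with a toy, model-free sketch: context tokens are masked while both entity mentions are kept intact, producing an infilling-style input that a generative model could complete. All names and data here are invented for illustration; this is not the authors' GDA implementation:

```python
import random

def entity_hint_mask(sentence: str, head: str, tail: str,
                     mask_rate: float = 0.3, seed: int = 0) -> str:
    """Mask context tokens while protecting both entity mentions,
    yielding an infilling-style input for a generative augmenter."""
    rng = random.Random(seed)
    protected = set(head.split()) | set(tail.split())
    masked = []
    for tok in sentence.split():
        # Keep entity tokens; mask other tokens with probability mask_rate.
        if tok.strip(".,") in protected or rng.random() >= mask_rate:
            masked.append(tok)
        else:
            masked.append("[MASK]")
    return " ".join(masked)

out = entity_hint_mask("Steve Jobs founded Apple in California .",
                       head="Steve Jobs", tail="Apple")
print(out)
```

Because the entity spans survive masking, any completion of the `[MASK]` slots still expresses a sentence about the same entity pair, which is the point of the hint.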
Expressiveness and machine processability of Knowledge Organization Systems (KOS): An analysis of concepts and relations
Peponakis, Manolis, Mastora, Anna, Kapidakis, Sarantos, Doerr, Martin
This study considers the expressiveness (that is, the expressive power or expressivity) of different types of Knowledge Organization Systems (KOS) and discusses its potential to be machine-processable in the context of the Semantic Web. For this purpose, the theoretical foundations of KOS are reviewed based on conceptualizations introduced by the Functional Requirements for Subject Authority Data (FRSAD) and the Simple Knowledge Organization System (SKOS); natural language processing techniques are also implemented. A comparative analysis is applied to a dataset comprising a thesaurus (Eurovoc), a subject headings system (LCSH) and a classification scheme (DDC); these are compared with an ontology (CIDOC-CRM) by focusing on how they define and handle concepts and relations. It was observed that LCSH and DDC focus on the formalism of character strings (nomens) rather than on the modelling of semantics; their definition of what constitutes a concept is quite fuzzy, and they comprise a large number of complex concepts. By contrast, thesauri have a coherent definition of what constitutes a concept and apply a systematic approach to the modelling of relations. Ontologies explicitly define diverse types of relations and are by their nature machine-processable. The paper concludes that the potential of both the expressiveness and machine processability of each KOS is extensively regulated by its structural rules. Subject headings and classification schemes are harder to represent as semantic networks with nodes and arcs, while thesauri are more suitable for such a representation. In addition, a paradigm shift is revealed which focuses on the modelling of relations between concepts, rather than on the concepts themselves.
- Europe > Netherlands > South Holland > The Hague (0.04)
- Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (14 more...)
- Information Technology > Communications > Web > Semantic Web (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.92)
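The contrast drawn above between thesauri and pre-coordinated headings can be sketched in a few lines: a thesaurus's broader/narrower relations map directly onto a node-and-arc graph, while a subject heading fuses several concepts into one opaque string. The data below is invented for illustration, not taken from the paper's dataset:

```python
# Thesaurus-style relations: explicit arcs between atomic concept nodes.
thesaurus = {
    ("energy", "renewable energy"): "narrower",
    ("renewable energy", "solar energy"): "narrower",
}

# Pre-coordinated heading: one label, its internal relations left implicit.
heading = "Solar energy -- Greece -- History"

# The thesaurus yields a graph for free; the heading stays a single node.
nodes = {concept for pair in thesaurus for concept in pair}
print(sorted(nodes))             # ['energy', 'renewable energy', 'solar energy']
print(heading.count("--") + 1)   # 3 facets fused into one label
```

This is why the abstract notes that thesauri are more amenable to a semantic-network representation: the arcs are already data, not conventions buried in a string.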
SKOS Concepts and Natural Language Concepts: an Analysis of Latent Relationships in KOSs
Mastora, Anna, Peponakis, Manolis, Kapidakis, Sarantos
The vehicle for representing Knowledge Organization Systems (KOSs) in the environment of the Semantic Web and linked data is the Simple Knowledge Organization System (SKOS). SKOS provides a way to assign a URI to each concept, and this URI functions as a surrogate for the concept. This makes clarifying the ontological meaning of these URIs a central concern. The aim of this study is to investigate the relation between the ontological substance of KOS concepts and the concepts revealed through the grammatical and syntactic formalisms of natural language. For this purpose, we examined the divisibility of concepts in specific KOSs (i.e. a thesaurus, a subject headings system and a classification scheme) by applying Natural Language Processing (NLP) techniques (i.e. morphosyntactic analysis) to the lexical representations (i.e. RDF literals) of SKOS concepts. The results of the comparative analysis reveal that, despite the use of multi-word units, thesauri tend to represent concepts in a way that can hardly be divided further conceptually, while subject headings and classification schemes, to a certain extent, comprise terms that can be decomposed into more conceptual constituents. Consequently, SKOS concepts deriving from thesauri are more likely to represent atomic conceptual units and thus be more appropriate tools for inference and reasoning. Since identifiers represent the meaning of a concept, complex concepts are neither the most appropriate nor the most efficient way of modelling a KOS for the Semantic Web.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (11 more...)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.66)
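A toy version of the morphosyntactic check described above: flag SKOS prefLabels whose token count or internal punctuation suggests a decomposable, pre-coordinated concept. The labels and the heuristic are illustrative assumptions, not the authors' pipeline (which applies full morphosyntactic analysis):

```python
# Invented example labels, keyed by a made-up identifier per source KOS.
labels = {
    "eurovoc:1": "renewable energy",               # thesaurus: arguably atomic
    "lcsh:1": "Women authors, Greek",              # heading: pre-coordinated
    "ddc:1": "Libraries and archives in France",   # class caption: compound
}

def looks_compound(label: str, max_tokens: int = 2) -> bool:
    # Rough proxy for decomposability: many tokens, or punctuation that
    # typically separates coordinated facets in headings.
    return "," in label or len(label.replace(",", " ").split()) > max_tokens

flagged = [key for key, label in labels.items() if looks_compound(label)]
print(flagged)  # ['lcsh:1', 'ddc:1']
```

Even this crude surrogate reproduces the abstract's tendency: the heading and the classification caption decompose, while the thesaurus label resists further division.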